Relative Compositionality of Multi-word Expressions: A Study of Verb-Noun (V-N) Collocations

Authors

  • Sriram Venkatapathy
  • Aravind K. Joshi
Abstract

Recognizing Multi-word Expressions (MWEs) and measuring their relative compositionality are crucial to Natural Language Processing. Various statistical techniques have been proposed to recognize MWEs. In this paper, we integrate all the existing statistical features and investigate a range of classifiers for their suitability for recognizing non-compositional Verb-Noun (V-N) collocations. In the task of ranking V-N collocations by their relative compositionality, we show that the correlation between the ranks computed by the classifier and the human ranking is significantly better than the correlation between the ranking produced by any individual feature and the human ranking. We also show that the properties ‘Distributed frequency of object’ (as defined in [27]) and ‘Nearest Mutual Information’ (as adapted from [18]) contribute greatly to the recognition of non-compositional MWEs of the V-N type and to the ranking of V-N collocations by their relative compositionality.
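The ranking evaluation described in the abstract compares a system's ordering of V-N collocations with a human ordering. The abstract does not name the correlation statistic, so the sketch below assumes Spearman's rank correlation, computed as the Pearson correlation of the two rank vectors. The function names and the example scores are hypothetical and are not taken from the paper.

```python
# Minimal sketch: rank correlation between system scores and human scores.
# Assumption: Spearman's rho is used; the abstract does not specify the statistic.

def ranks(scores):
    """Return 1-based ranks (highest score gets rank 1); ties share the average rank."""
    order = sorted(range(len(scores)), key=lambda i: -scores[i])
    result = [0.0] * len(scores)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of items tied with the item at position i.
        while j + 1 < len(order) and scores[order[j + 1]] == scores[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length sequences with non-zero variance."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(system_scores, human_scores):
    """Spearman's rho: Pearson correlation computed on the rank vectors."""
    return pearson(ranks(system_scores), ranks(human_scores))


# Hypothetical compositionality scores for five collocations (illustrative only).
human = [6.0, 1.5, 4.0, 2.0, 5.0]
system = [0.9, 0.2, 0.4, 0.3, 0.8]
print(round(spearman(system, human), 3))
```

Under this assumption, the abstract's claim amounts to the classifier's scores yielding a higher value from `spearman` against the human scores than the scores of any single feature.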

Similar resources

Measuring the Relative Compositionality of Verb-Noun (V-N) Collocations by Integrating Features

Measuring the relative compositionality of Multi-word Expressions (MWEs) is crucial to Natural Language Processing. Various collocation based measures have been proposed to compute the relative compositionality of MWEs. In this paper, we define novel measures (both collocation based and context based measures) to measure the relative compositionality of MWEs of V-N type. We show that the correl...

Lexical and Grammatical Collocations in Writing Production of EFL Learners

Lewis (1993) recognized the significance of word combinations, including collocations, by presenting the lexical approach. Because of the crucial role of collocation in vocabulary acquisition, this research set out to evaluate the rate of collocations in Iranian EFL learners' writing production across L1 and L2. In addition, L1 interference with L2 collocational use in the learners' writing samples was st...

Using Distributional Similarity of Multi-way Translations to Predict Multiword Expression Compositionality

We predict the compositionality of multiword expressions using distributional similarity between each component word and the overall expression, based on translations into multiple languages. We evaluate the method over English noun compounds, English verb particle constructions and German noun compounds. We show that the estimation of compositionality is improved when using translations into m...

Shared Task System Description: Measuring the Compositionality of Bigrams using Statistical Methodologies

Measuring the relative compositionality of bigrams is crucial for identifying Multi-word Expressions (MWEs) in Natural Language Processing (NLP) tasks. The article presents the experiments carried out as part of the participation in the shared task ‘Distributional Semantics and Compositionality (DiSCo)’, organized as part of the DiSCo workshop at ACL-HLT 2011. The experiments deal with various c...

Combining Different Features of Idiomaticity for the Automatic Classification of Noun+Verb Expressions in Basque

We present an experimental study of how different features help measure the idiomaticity of noun+verb (NV) expressions in Basque. After testing several techniques for quantifying the four basic properties of multiword expressions or MWEs (institutionalization, semantic non-compositionality, morphosyntactic fixedness and lexical fixedness), we test different combinations of them for classifica...

Publication date: 2005